20. scikit-image HOG

Now that we've got a dataset let's extract some HOG features!

The scikit-image package has a built-in function to extract Histogram of Oriented Gradients (HOG) features. The scikit-image documentation describes this function and also provides a brief explanation of the algorithm along with a tutorial.

The scikit-image hog() function takes a single color channel or grayscale image as input, along with various parameters. These parameters include orientations, pixels_per_cell, and cells_per_block.

The number of orientations is specified as an integer, and represents the number of orientation bins that the gradient information will be split up into in the histogram. Typical values are between 6 and 12 bins.
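To make the binning concrete, here is a minimal NumPy sketch of an orientation histogram for a single cell. It is a simplified illustration rather than scikit-image's actual implementation: the helper name cell_orientation_histogram is hypothetical, and the sketch assumes unsigned gradients (0 to 180 degrees) with each pixel voting by its gradient magnitude.

```python
import numpy as np

def cell_orientation_histogram(cell, orientations=9):
    """Simplified magnitude-weighted orientation histogram for one cell."""
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    # Fold directions into [0, 180) so opposite gradients share a bin
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=orientations, range=(0.0, 180.0),
                           weights=magnitude)
    return hist

# An 8x8 cell whose brightness increases from left to right: every gradient
# points along x, so all of the weight lands in the first orientation bin.
cell = np.tile(np.arange(8, dtype=float), (8, 1))
hist = cell_orientation_histogram(cell, orientations=9)
```

With more bins the histogram resolves finer differences in edge direction, at the cost of a longer feature vector.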

The pixels_per_cell parameter specifies the cell size over which each gradient histogram is computed. This parameter is passed as a 2-tuple, so you could have different cell sizes in x and y, but cells are commonly chosen to be square.

The cells_per_block parameter is also passed as a 2-tuple, and specifies the local area over which the histogram counts in a given cell will be normalized. Block normalization is not necessarily required, but generally leads to a more robust feature set.
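As an illustration of what block normalization does, the sketch below applies the L2-Hys scheme (the block_norm option used in the code later in this section) to the histogram values of one block. It follows the commonly described recipe: L2-normalize, clip values at 0.2, then L2-normalize again. The function name l2_hys_normalize and the eps value are assumptions made for illustration, not scikit-image internals.

```python
import numpy as np

def l2_hys_normalize(block, clip=0.2, eps=1e-5):
    """L2-normalize, clip large values, then L2-normalize again."""
    v = np.asarray(block, dtype=float).ravel()
    v = v / np.sqrt(np.sum(v ** 2) + eps ** 2)
    v = np.minimum(v, clip)
    v = v / np.sqrt(np.sum(v ** 2) + eps ** 2)
    return v

# The same gradient pattern at two different contrast levels
dim_block = np.array([3.0, 4.0, 0.0, 0.0])
bright_block = 10.0 * dim_block

normalized_dim = l2_hys_normalize(dim_block)
normalized_bright = l2_hys_normalize(bright_block)
```

Scaling the input by 10 leaves the normalized result essentially unchanged, which is the kind of robustness to contrast changes that block normalization aims for.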

There is also an optional power law or "gamma" normalization scheme, set by the flag transform_sqrt. This type of normalization may help reduce the effects of shadows or other illumination variation, but will cause an error if your image contains negative values (because it takes the square root of the image values).
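The following NumPy sketch shows why a square root compresses illumination differences and why negative values are a problem. It only mimics the compression step on raw pixel values. The safe_for_sqrt helper is a hypothetical check you might run before enabling transform_sqrt; it is not part of scikit-image.

```python
import numpy as np

# Pixel values spanning a 100:1 brightness range
pixels = np.array([0.01, 0.25, 1.0])
# After the square root the range shrinks to 10:1
compressed = np.sqrt(pixels)

def safe_for_sqrt(image):
    """Return True if the image contains no negative values."""
    return bool(np.min(image) >= 0)

# A mean-subtracted image, for example, can contain negative values
shifted = pixels - 0.5
ok_original = safe_for_sqrt(pixels)
ok_shifted = safe_for_sqrt(shifted)
```

Here ok_original is True and ok_shifted is False, so only the first array would be a sensible input with transform_sqrt=True.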

This is where things get a little confusing, though. Let's say you are computing HOG features for an image like the one shown above that is 64×64 pixels. If you set pixels_per_cell=(8, 8), cells_per_block=(2, 2), and orientations=9, how many elements will you have in your HOG feature vector for the entire image?

You might guess the number of orientations times the number of cells, or 9×8×8 = 576, but that's not the case if you're using block normalization! In fact, the HOG features for all cells in each block are computed at each block position, and the block steps across and down through the image one cell at a time.

So, the actual number of features in your final feature vector will be the total number of block positions multiplied by the number of cells per block, times the number of orientations. With 8 cells per side and a 2-cell-wide block stepping one cell at a time, there are 8 − 2 + 1 = 7 block positions per side, so in the case shown above: 7×7×2×2×9 = 1764.
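The arithmetic above can be captured in a few lines of Python. This is a sketch under stated assumptions: a square image, square cells and blocks, and a block that steps one cell at a time. The function name hog_feature_length is hypothetical.

```python
def hog_feature_length(image_size, pix_per_cell, cell_per_block, orient):
    """Length of the HOG feature vector for a square image."""
    cells_per_side = image_size // pix_per_cell
    # The block slides one cell at a time, so it fits in this many positions
    blocks_per_side = cells_per_side - cell_per_block + 1
    return blocks_per_side ** 2 * cell_per_block ** 2 * orient

# The example from the text: 64x64 image, 8x8 cells, 2x2 blocks, 9 bins
n_features = hog_feature_length(64, 8, 2, 9)

# The naive guess that ignores overlapping blocks
n_without_blocks = 9 * (64 // 8) ** 2
```

For the example in the text, n_features is 1764 and n_without_blocks is 576.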

For the example above, you would call the hog() function on a single color channel img like this:

from skimage.feature import hog
pix_per_cell = 8
cell_per_block = 2
orient = 9

hog_features, hog_image = hog(img, orientations=orient,
                              pixels_per_cell=(pix_per_cell, pix_per_cell),
                              cells_per_block=(cell_per_block, cell_per_block),
                              # Older scikit-image releases spelled this keyword `visualise`
                              visualize=True, feature_vector=False,
                              block_norm="L2-Hys")

The visualize=True flag (spelled visualise in older scikit-image releases) tells the function to output a visualization of the HOG feature computation as well, which we're calling hog_image in this case. If we take a look at a single color channel for a random car image and its corresponding HOG visualization, they look like this:

The HOG visualization is not actually the feature vector, but rather, a representation that shows the dominant gradient direction within each cell with brightness corresponding to the strength of gradients in that cell, much like the "star" representation in the last video.

If you look at the hog_features output, you'll find it's an array of shape 7×7×2×2×9. This corresponds to the fact that a grid of 7×7 blocks was sampled, with 2×2 cells in each block and 9 orientations per cell. You can unroll this array into a feature vector using hog_features.ravel(), which yields, in this case, a one-dimensional array of length 1764.
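To see how that five-dimensional layout unrolls, the sketch below uses a placeholder NumPy array with the same shape as hog_features. The values are stand-ins, not real HOG output, and the name hog_like is hypothetical.

```python
import numpy as np

# Stand-in array shaped like hog_features: (block row, block column,
# cell row within block, cell column within block, orientation bin)
hog_like = np.arange(7 * 7 * 2 * 2 * 9, dtype=float).reshape(7, 7, 2, 2, 9)

# The 9-bin histogram of the top-left cell in the block at row 3, column 5
one_cell_hist = hog_like[3, 5, 0, 0]

# Unroll everything into a single feature vector
flat = hog_like.ravel()
```

Here one_cell_hist has 9 entries and flat has 1764, matching the counts worked out above.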

Alternatively, you can set the feature_vector=True flag when calling the hog() function to automatically unroll the features. In the project, it could be useful to define a function that takes an image along with specifications for orientations, pixels_per_cell, and cells_per_block, plus flags for whether you want the feature vector unrolled and/or a visualization image returned. Let's write it!

# Define a function to return HOG features and visualization
# Features will always be the first element of the return
# Image data will be returned as the second element if vis=True
# Otherwise there is no second return element

def get_hog_features(img, orient, pix_per_cell, cell_per_block, vis=True, feature_vec=True):

    # TODO: Complete the function body and returns
    pass

Note: you could also include a keyword to set the transform_sqrt flag, but for this exercise you can just leave it at the default value of transform_sqrt=False.

Start Quiz:

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
from skimage.feature import hog

# Read in our vehicles
car_images = glob.glob('*.jpeg')
        
# Define a function to return HOG features and visualization
# Features will always be the first element of the return
# Image data will be returned as the second element if vis=True
# Otherwise there is no second return element

def get_hog_features(img, orient, pix_per_cell, cell_per_block, vis=True, 
                     feature_vec=True):
                         
    # TODO: Complete the function body and returns
    pass

# Generate a random index to look at a car image
ind = np.random.randint(0, len(car_images))
# Read in the image
image = mpimg.imread(car_images[ind])
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Call our function with vis=True to see an image output
features, hog_image = get_hog_features(gray, orient= 9, 
                        pix_per_cell= 8, cell_per_block= 2, 
                        vis=True, feature_vec=False)


# Plot the examples
fig = plt.figure()
plt.subplot(121)
plt.imshow(image, cmap='gray')
plt.title('Example Car Image')
plt.subplot(122)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')

Solution:

def get_hog_features(img, orient, pix_per_cell, cell_per_block, vis=True,
                     feature_vec=True):
    """
    Return HOG features (flattened if feature_vec=True) for a single-channel image.
    The features are always the first return value. A visualization image is
    returned as the second value only if vis=True.
    """
    # Note: older scikit-image releases spelled the `visualize` keyword `visualise`
    result = hog(img, orientations=orient,
                 pixels_per_cell=(pix_per_cell, pix_per_cell),
                 cells_per_block=(cell_per_block, cell_per_block),
                 block_norm='L2-Hys', transform_sqrt=False,
                 visualize=vis, feature_vector=feature_vec)

    # hog() returns a (features, image) tuple only when visualize=True;
    # otherwise it returns the feature array itself
    if vis:
        hog_features, hog_image = result
        return hog_features, hog_image
    return result